1. INTRODUCTION

Back  projection  (BP)  tomography  is  a  technique  for  reconstructing  an  image  from  its 
projections.  For  x-rays,  each  projection  sample  is  formed  by  an  integration  line  that  passes 
through  the  object  to  be  imaged.  The  value  of  the  projection  sample  is  the  absorption  of  the 
x-ray  along  the  integration  line.  A  projection  is  formed  by  a  set  of  parallel  integration  lines. 
The  corresponding  projection  line  is  orthogonal  to  the  integration  lines.  Different  projections 
are  obtained  by  rotating  the  integration  and  projection  lines  relative  to  the  object.  For  x-rays, 
the  reconstructed  image  is  a  two-dimensional  (2-D)  representation  of  absorption  at  each 
point  in  the  plane  of  rotation.  Displacement  of  the  plane  of  rotation  along  a  line  orthogonal  to 
the  plane  yields  a  set  of  2-D  reconstructions  that  can  be  combined  to  form  a  3-D  image 
(Macovski,  1983;  Rosenfeld  &  Kak,  1982). 

For  wideband  sonar,  radar,  and  ultrasound  with  fine  range  resolution,  the  echo  amplitude 
vs.  delay  (A-scan)  is  another  form  of  projection.  If  the  echo  amplitude  at  a  given  delay 
consists  of  the  sum  of  the  reflectivity  from  all  points  with  the  same  delay,  then  the  echo 
amplitude  is  a  projection  sample  formed  by  a  constant-delay  integration  surface  (e.g.,  a 
spherical  shell)  that  passes  through  the  object  to  be  imaged.  The  sum  of  point  reflectivities  is 
weighted  by  the  beam  pattern.  The  corresponding  projection  line  is  the  range  axis  or  the 
direction  along  which  the  beam  pattern  is  maximized.  Different  projections  are  obtained  by 
changing  aspect  angle  with  respect  to  the  target  (e.g.,  by  rotating  the  target  or  moving  the 
transmitter/receiver  along  a  path  orthogonal  to  the  propagation  direction).  Synthetic  aperture 
(SA)  processing  is  a  technique  for  constructing  an  image  of  an  object  from  A-scans  that  are 
obtained at different aspect angles, i.e., from a sequence of projections (Munson, O'Brien, &
Jenkins, 1983).

If the beam pattern were sufficiently narrow that cross-range (azimuth or elevation)
resolution were as fine as range resolution, then SA processing would be unnecessary. Many
wideband  sonar/radar/ultrasound  systems,  however,  have  much  finer  resolution  in  the  range 
direction  than  in  cross-range.  SA  processing  is  a  technique  to  convert  such  fine  range 
resolution  into  fine  cross-range  resolution. 

Appendix A shows that SA imaging and BP tomography can be made identical by applying
a gradual high-pass filtering function to A-scan radar/sonar data. This equivalence
between  SA  and  BP  has  considerable  practical  and  theoretical  consequences.  These 
consequences  and  their  applications  are  discussed  in  the  following  sections. 

2.  THE  POINT  SPREAD  FUNCTION  AND  THE  RANGE, 
CROSS-RANGE  AMBIGUITY  FUNCTION  (RCAF) 

The  point  spread  function  of  an  imaging  system  is  the  representation  of  a  single  point  (a 
2-D  impulse)  by  the  system.  For  a  finite  aperture  system,  this  representation  is  a  smeared 
version  of  the  point.  This  smeared  version  is  obtained  by  2-D  convolution  of  the  input  point 
with  the  system  point  spread  function. 

For  a  finite  aperture  (spatially  band-limited)  system,  image  input  data  can  be  represented 
as  a  set  of  sample  points  with  different  amplitudes  at  a  sequence  of  2-D  locations.  For  a 
linear  system,  the  image  of  this  input  is  obtained  by  2-D  convolution  of  the  input  with  the 
system point spread function. The image is, thus, smeared or defocused by the point spread
function. A more accurate representation of the input data is obtained by using an imaging
system  with  a  point  spread  function  that  more  closely  resembles  an  impulse. 

The point spread function of a BP image reconstruction system can be predicted by using
the equivalence of BP and synthetic aperture sonar (SAS) processing. For a real or synthetic array, the imaging capability of a
radar/sonar  system  is  described  by  the  range,  cross-range  ambiguity  function  (RCAF)  of  the 
system  (Altes,  1979).  An  ambiguity  function  is  the  response  of  a  receiver  (e.g.,  a  correlator 
or  likelihood  function  generator)  to  a  set  of  parameter  hypotheses  when  the  largest  response 
is  obtained  for  the  correct  parameter  hypotheses.  An  accurate  parameter  estimator  will  have 
a  large  response  for  the  correctly  hypothesized  parameter  values  and  a  small  response  for 
incorrect hypotheses. The ambiguity function of an accurate estimator will be an impulse-like
function of the hypothesized parameter values. If the parameters are range and cross-range
(azimuth or elevation), then a point target that is concentrated at a single range, cross-range
position will be represented by an RCAF that is maximized at the position of the point
target when this position is correctly hypothesized. A complete image is obtained by moving
the hypothesized range-azimuth locations over the image plane. This movement of
hypothesized parameters results in a convolution of the RCAF with a 2-D reflectivity function.
The  image  of  a  scatterer  that  is  composed  of  many  point  targets  is  a  2-D  convolution  of  the 
reflector  distribution  with  the  RCAF. 

The  RCAF  of  a  radar/sonar  system  operates  in  the  same  way  as  the  point  spread  function 
of  an  optical  system;  2-D  convolution  of  the  input  distribution  with  the  RCAF  or  point 
spread  function  determines  the  system’s  representation  of  an  image.  The  RCAF  is,  thus,  the 
point  spread  function  of  a  radar/sonar  imaging  system.  For  a  single  transducer,  the  RCAF  is 
determined  by  the  autocorrelation  function  of  the  transmitted  signal  in  the  range  direction 
and  by  the  transducer  beam  pattern  in  the  cross-range  direction.  The  RCAF  can  be  measured 
by  moving  a  point  target  away  from  a  hypothesized  target  location  at  the  center  of  the 
transducer  beam  pattern  and  noting  the  receiver  response  to  the  target  at  different  range  and 
cross-range locations when the hypothesized location is unchanged. This measurement
assumes  that  the  receiver  uses  correlation  (or  likelihood  function  synthesis)  in  both  range  and 
cross-range.  Correlation  in  the  range  direction  is  accomplished  with  temporal  matched 
filtering,  and  correlation  in  the  cross-range  direction  is  accomplished  with  delay-and-sum 
beam  forming  for  multi-element  arrays.  The  RCAF  of  a  single  transducer  will  be  called  the 
“range-azimuth  beam  pattern”  of  the  transducer/waveform  combination. 

For  delay-and-sum  beam  forming,  the  RCAF  for  a  real  or  synthetic  array  of  transducers  is 
the  sum  of  the  individual  transducer  range-azimuth  beam  patterns  at  the  location  of  an 
imaged point. For a wideband, high-resolution sonar, any one beam pattern can be approximated
by a line segment that is orthogonal to the propagation direction. The width (range
extent)  of  the  line  segment  is  the  range  resolution  cell  of  the  sonar,  and  the  line  segment 
length  (azimuth  extent)  is  the  transducer  beam  width.  The  beam  width  is  typically  much 
wider  than  the  range  resolution  cell,  yielding  a  line-like  range-azimuth  beam  pattern.  (If 
azimuth  or  cross-range  resolution  were  as  good  as  range  resolution,  there  would  be  no  need 
for  back  projection  or  synthetic  aperture  processing;  the  environment  could  be  imaged  from 
a  single  aspect  angle.)  The  line  segments  corresponding  to  all  the  real  or  synthetic  transducer 
locations  intersect  at  a  point  being  imaged  (the  hypothesized  point  target  position),  creating 
an  asterisk-like  pattern.  This  asterisk-like  pattern  is  the  sum  of  rotated  range-azimuth  beam 
patterns  and  is  the  point  spread  function  or  RCAF  of  a  synthetic  multistatic  system  composed 
of  multiple  transducers  at  different  locations. 


The  SAS  or  BP  image  is  the  actual  reflectivity  distribution  convolved  or  smeared  by  the 
asterisk-like  point  spread  function  (the  RCAF).  The  peak-to-sidelobe  ratio  of  the  point 
spread  function  is  a  measure  of  how  well  the  point  spread  function  resembles  an  impulse, 
and is, thus, associated with image quality. For a synthetic aperture sonar or radar, the
peak-to-sidelobe ratio is the number of different transducers or aspect angles (i.e., the sum of all
the  rotated  line  segments  at  their  intersection  divided  by  the  height  of  one  of  the  line 
segments). 
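
The asterisk-like point spread function can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions that are not part of the original analysis: each range-azimuth beam pattern is idealized as a one-pixel-wide line of unit height, the line orientations are spread uniformly over 180 degrees, and the sidelobe level is read far enough from the center that neighboring lines no longer overlap. The function names and grid size are hypothetical.

```python
import math

def asterisk_psf(num_angles, size=81, span_deg=180.0):
    """Sum of rotated one-pixel-wide line segments through the grid center.

    Each aspect angle contributes one unit-height line, a crude stand-in for
    a line-like range-azimuth beam pattern (an idealization, not a model of
    any particular transducer).
    """
    center = size // 2
    psf = [[0] * size for _ in range(size)]
    for k in range(num_angles):
        theta = math.radians(k * span_deg / num_angles)
        hit = set()
        # Step along the line in half-pixel increments; the set ensures that
        # each pixel is counted only once per line.
        for step in range(-2 * center, 2 * center + 1):
            r = step / 2.0
            col = center + round(r * math.cos(theta))
            row = center + round(r * math.sin(theta))
            if 0 <= row < size and 0 <= col < size:
                hit.add((row, col))
        for row, col in hit:
            psf[row][col] += 1
    return psf

def peak_to_sidelobe(psf, min_radius):
    """Central peak divided by the largest value at least min_radius pixels away."""
    size = len(psf)
    center = size // 2
    peak = psf[center][center]
    sidelobe = max(
        psf[r][c]
        for r in range(size)
        for c in range(size)
        if math.hypot(r - center, c - center) >= min_radius
    )
    return peak / sidelobe
```

With eight line orientations, all eight lines add at the center while only one line covers any pixel far from the center, so the ratio equals the number of aspect angles, as stated above.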


3.  SYSTEM  TRADEOFFS 

The  effect  of  restricting  the  observation  interval  to,  for  example,  90  degrees  rather  than 
360  degrees,  can  be  assessed  by  analyzing  the  effect  of  such  a  restriction  on  the  RCAF.  If  the 
bandwidth  and  angular  sampling  interval  remain  unchanged,  then  reducing  the  observation 
interval  by  a  factor  of  four  should  reduce  the  peak-to-sidelobe  ratio  by  a  factor  of  four.  The 
asterisk  pattern  is  also  affected;  the  sidelobes  are  restricted  to  a  “bow  tie”  pattern  that  is  90 
degrees  wide  on  each  side.  The  reduction  in  peak-to-sidelobe  level  results  in  smearing  or 
defocusing  of  the  image.  To  counteract  this  effect,  more  line  segments  within  the  bow  tie  can 
be  added  by  sampling  more  often  in  angle,  viz.,  four  times  as  often.  The  peak-to-sidelobe 
ratio  will  then  be  the  same  as  before,  provided  that  the  line  segments  are  sufficiently  narrow 
in  width  (range  extent)  so  as  not  to  overlap  significantly  (except  at  the  center  of  the  asterisk 
or  bow  tie)  when  they  are  rotated  by  only  one-quarter  of  the  initial  angular  sampling  interval. 
To  assure  that  additional  overlap  does  not  occur,  the  line-segment  widths  should  be  narrowed 
by  increasing  the  bandwidth  by  a  factor  of  four. 

Given  an  image  of  independent  point  scatterers  that  is  obtained  over  a  360-degree  interval, 
an  equivalent  image  can  theoretically  be  obtained  over  an  observation  interval  of  360/N 
degrees, provided that the angular sampling rate and the signal bandwidth are both
multiplied by N. This result is consistent with an animal sonar system that attempts to obtain a
high-quality  acoustic  image  from  echo  data  that  are  observed  over  a  limited  observation 
angle.  The  predicted  behavior  is  to  make  the  bandwidth  as  wide  as  possible  and  to  increase 
the  angular  sampling  rate.  Echolocating  dolphins  move  back  and  forth  near  an  object  that 
they  are  trying  to  identify  (Gisiner,  1994),  while  emitting  echolocation  clicks  at  an  extremely 
high  repetition  rate  (Moore  et  al.,  1990;  Roitblat  et  al.,  1991).  This  behavior  increases  the 
angular  sampling  frequency  for  a  restricted  observation  interval.  The  dolphin’s  echolocation 
bandwidth  is  also  effectively  increased  via  higher  SNR  and  lower  high-frequency  attenuation 
associated  with  proximity  to  a  target. 

Ambiguity  function  analysis  thus  indicates  a  tradeoff  between  angular  observation  interval 
and  bandwidth  for  target  imaging.  A  smaller  aspect  interval  can  be  compensated  by 
increased  bandwidth,  along  with  increased  angular  sampling  rate.  At  long  range,  increased 
angular  sampling  rate  can  be  obtained  by  transmitting  a  sequence  of  decorrelated  signals 
(e.g.,  linear-frequency-modulated  signals  with  different  chirp  rates). 
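
The tradeoff can be written as a simple scaling rule. The helper below is a hypothetical illustration, not part of the original analysis: it takes a full-interval design and returns the parameters of the design that is predicted to be equivalent when the observation interval is divided by N. The baseline bandwidth used in the usage note is an arbitrary assumed value.

```python
def equivalent_system(full_interval_deg, angle_step_deg, bandwidth_hz, n):
    """Scale a full-interval design to an observation interval of full_interval_deg / n.

    The angular sampling rate and the bandwidth are both multiplied by n, so
    the number of aspect angles (and hence the predicted peak-to-sidelobe
    ratio) is unchanged.
    """
    return {
        "interval_deg": full_interval_deg / n,
        "angle_step_deg": angle_step_deg / n,  # sampling rate multiplied by n
        "bandwidth_hz": bandwidth_hz * n,
        "num_aspects": round(full_interval_deg / angle_step_deg),
    }
```

For example, equivalent_system(360.0, 3.6, 50e3, 4) describes a 90-degree interval sampled every 0.9 degrees with four times the assumed 50 kHz bandwidth, still using 100 aspect angles.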

4.  ROTATED  WAVELET  ANALYSIS 

Another  practical  problem  is  obtaining  a  high-quality  image  with  sparse  angular  sampling 
from  a  relatively  small  number  of  aspect  angles.  In  terms  of  the  point  spread  function  or 
RCAF,  the  problem  is  to  design  an  imaging  system  with  an  impulse-like  point  spread 
function,  even  though  aspect  angles  may  be  separated  by,  for  example,  30  degrees.  This  goal 
can be accomplished by using range-azimuth beam patterns with a particular sidelobe
structure. Negative sidelobes occur naturally with finite-length transducers and with signals
that  have  no  power  at  zero  frequency.  The  object  is  to  design  these  sidelobes  so  that  a  sum  of 
rotated  range-azimuth  beam  patterns  is  impulse-like.  Each  beam  pattern  should  be  cancelled 
by  the  negative  sidelobes  of  its  rotated  neighbors  except  at  the  point  of  rotation,  which 
corresponds to the center of each beam pattern. The 2-D Fourier transform of an impulse is a
constant over the system bandwidth. The sum of the Fourier transforms of the rotated
range-azimuth beam patterns (after space-time matched filtering) should, thus, be constant over the
system bandwidth. This design criterion is similar to wavelet analysis, except that the basis
functions  are  2-D  range-azimuth  beam  patterns  that  are  rotated  rather  than  scaled.  The 
required  range-azimuth  beam  patterns  are  easily  obtained  via  2-D  Fourier  transform  analysis, 
and  they  resemble  the  measured  versions  of  such  patterns  that  are  observed  around  an 
echolocating  dolphin  (Altes,  1995). 

5.  PARTIAL  IMAGE  RECONSTRUCTION  FROM  PROJECTIONS 

It  is  not  clear  how  to  construct  only  part  of  an  image  with  conventional  back  projection, 
but  the  equivalence  to  SAS  makes  such  construction  straightforward.  To  investigate  part  of  a 
scene,  the  synthetic  array  focuses  on  each  point  contained  in  the  area  of  interest  and 
disregards  all  other  areas.  Focusing  is  accomplished  by  delay-and-sum  beamforming. 
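
Partial reconstruction by delay-and-sum focusing might be sketched as follows. The geometry is an assumption made for illustration: a monostatic transducer at a known 2-D position for each aspect, a nominal sound speed of 1500 m/s, and nearest-sample delay lookup without interpolation or beam-pattern weighting. The function and parameter names are hypothetical.

```python
import math

SOUND_SPEED = 1500.0  # m/s, assumed nominal value for water

def focus_region(a_scans, sensor_xy, pixels_xy, fs, t0=0.0, c=SOUND_SPEED):
    """Delay-and-sum image values for only the pixels listed in pixels_xy.

    a_scans[k] is the filtered echo (a list of samples) for aspect k, sampled
    at fs Hz starting t0 seconds after transmission. sensor_xy[k] is the
    transducer position for that aspect in target coordinates (meters).
    """
    image = []
    for px, py in pixels_xy:
        total = 0.0
        for echo, (sx, sy) in zip(a_scans, sensor_xy):
            # Two-way travel time from the transducer to the hypothesized point.
            delay = 2.0 * math.hypot(px - sx, py - sy) / c
            idx = round((delay - t0) * fs)  # nearest echo sample
            if 0 <= idx < len(echo):
                total += echo[idx]
        image.append(total)
    return image
```

Only the pixels in the area of interest are visited, so the cost scales with the size of that area rather than with the full scene. Because each echo contributes an independent term to the sum, the same loop can also be run one echo at a time as new aspects arrive.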

6.  FEATURE  IMAGES 

The  usual  SAR/SAS/BP  radar/sonar  image  is  a  representation  of  target  reflectivity  as  a 
function  of  position  (pixel  location).  Other  features,  however,  are  important  for  target 
identification  and  target/clutter  discrimination.  Such  features  include  the  echo  bandwidth, 
echo center frequency, and texture (e.g., the relative number of large and small maxima in a
small interval surrounding a given location). These features can be extracted from each
echo  and  represented  as  a  function  of  delay,  yielding  a  generalized  A-scan  display  of  feature 
value  vs.  range.  These  generalized  projections  can  be  combined  to  form  a  high-resolution 
SAS/BP  image  of  the  feature  value  as  a  function  of  pixel  location.  Feature  images  can 
accentuate  or  deaccentuate  clutter  relative  to  a  standard  reflectivity  image,  and  difference 
images  that  greatly  enhance  signal-to-clutter  ratio  (SCR)  can  be  constructed.  Some  examples 
are  presented  in  section  8. 

Different  feature  images  comprise  spatially  registered  maps  of  the  acoustic  environment. 
Combining  these  images  (e.g.,  by  forming  weighted  differences  between  them)  is  equivalent 
to  forming  a  composite  image  from  different  features  at  the  same  location.  This  type  of 
representation and combination is found in the superior colliculus of mammals (Dräger &
Hubel, 1975) and the optic tectum of reptiles (Hartline, Kass, & Loop, 1978) and fish
(Bastian,  1986).  Overlaid  neuronal  feature  maps  are  derived  from  different  sensors  (e.g., 
vision  and  infrared  in  the  rattlesnake).  Large  neurons  penetrate  the  spatially  registered  maps 
to  combine  features  at  a  given  environmental  location.  These  neurons  implement  a  biological 
version  of  sensor  fusion. 

7.  THREE-DIMENSIONAL  REPRESENTATIONS 

Different projections are obtained when an object is rotated relative to a
transmitter/receiver (e.g., when the object or the radar/sonar platform moves in a direction that is
orthogonal to the propagation direction). In many practical situations, the platform is above
the plane of rotation. This displacement in elevation allows for a 3-D representation of target
reflectivity. To construct a 2-D image, the elevation of the corresponding image plane must
be  specified.  A  sequence  of  spatially  registered  2-D  images  is  obtained  for  different  specified 
elevations.  These  images  can  be  included  in  the  set  of  spatially  registered  feature  maps  that 
are  combined  to  form  a  composite  image  with  increased  SCR.  Elevation  can,  thus,  be  treated 
as  another  feature  in  a  set  of  feature  maps. 

Extra information from elevation can be used to adaptively improve focusing and to
construct a 3-D surface that represents the physical shape of an object as it would be
perceived  with  vision.  Such  a  surface  is  different  from  a  representation  of  reflectivity  as  a 
function  of  position,  and  it  allows  direct  comparison  with  visual  representations  (e.g., 
photographs)  of  objects.  A  visual  analogue  would  be  very  useful  to  an  animal  that  tries  to 
perform  sensor  fusion  by  combining  spatially  registered  feature  maps  from  vision  and 
echolocation.  There  is  some  speculation  that  dolphins  may  be  capable  of  such  a  vision-like 
target representation (Pack & Herman, 1995), although the issue of the cognitive representation
of targets formed by echolocating dolphins remains open to debate (Helweg et al., 1996;
Harley et al., 1995; Roitblat et al., 1995), and the lay concept of "seeing with echolocation"
remains  unsubstantiated.  However,  a  vision-like  target  representation  is  well-suited  to  human 
observers  and  could  have  application  to  improvement  of  MCM  performance. 

8.  EXAMPLES  AND  APPLICATIONS 

Sonar  echoes  from  a  rotating  ONI-certified  ROCKAN  mine  simulator  were  collected  in 
Lake  Travis  by  the  Applied  Research  Laboratory  at  the  University  of  Texas  in  Austin,  Texas. 
When seen from above, the target resembles a trapezoid with small fins or tabs at the corners
of the base (figure 1). The target was suspended at a 45° angle so that it was not totally
contained within the plane of rotation. The sonar receiver was 7 m above the plane of
rotation and was at a depth of 3 m. The center of rotation was 31 m from the sonar transducer.
The signal was a broadband bottlenose dolphin echolocation click typical of
echolocation signals used by U.S. Navy dolphins (Moore, 1997). The digitized click and its
spectrum are presented in figure 1a. Echoes were digitized at a sampling frequency of 500
kHz,  and  observation  angles  were  separated  by  approximately  0.36  degrees. 

The  echo  from  each  aspect  angle  was  processed  with  a  filter  that  yielded  a  minimum 
mean-square  error  estimate  of  the  target  impulse  response.  This  filter  was  a  cascade  of  an 
inverse filter and a Wiener filter (Turin, 1957; Altes, 1977). For frequency domain components
with high signal-to-noise ratio (SNR), the filter transfer function approximates the inverse of
the  signal  spectrum.  At  low  SNR,  the  filter  transfer  function  approximates  a  matched  filter. 
The  filtered  echoes  from  different  aspects  are  A-scans  or  projections  that  are  combined  to 
form  a  SAS/BP  image. 
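
The two limiting behaviors of the filter can be checked with a small numerical sketch. The transfer function used below, S*(f) / (|S(f)|^2 + noise level), is the common textbook form of an inverse filter cascaded with a Wiener filter and is assumed here for illustration; the exact filter used to produce the images is not specified beyond the description above. The function name and the scalar noise_level parameter are hypothetical.

```python
def estimator_transfer(signal_spectrum, noise_level):
    """Per-bin transfer function S*(f) / (|S(f)|^2 + noise_level).

    signal_spectrum is a list of complex values S(f) of the transmitted
    signal; noise_level (> 0) is an assumed noise power relative to the
    expected power of the target impulse response.
    """
    return [s.conjugate() / (abs(s) ** 2 + noise_level) for s in signal_spectrum]
```

Where |S(f)|^2 is much larger than the noise level, the result is close to 1/S(f) (the inverse filter). Where it is much smaller, the result is close to S*(f) divided by the noise level, which is a scaled matched filter.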

Figure  2  shows  the  resulting  SAS/BP  images  before  image-processing  techniques  (such  as 
adaptive  thresholding  and  contrast  enhancement)  were  applied.  These  images  are  sufficient 
for  demonstration  purposes,  but  they  do  not  exploit  the  full  resolution  capability  of  the 
system.  Each  image  pixel  incorporates  20  echo  samples  and,  thus,  represents  a  volume 
element  that  is  3  cm  on  a  side.  Every  tenth  echo  was  used,  so  the  observation  angles  were 
separated  by  approximately  3.6  degrees.  The  elevation  of  the  image  plane  in  figure  2  was 
chosen  to  maximize  the  variance  of  the  pixel  levels  in  the  2-D  image  and,  thus,  maximize  a 
measure  of  overall  sharpness  or  focus.  The  cloudiness  of  the  image  is  associated  with  volume 
clutter (reverberation) that usually is caused by small air bubbles. In medical ultrasound,
such clutter is much more pronounced, and is called “speckle.” Volume reverberation in
sonar  should  become  much  stronger  for  buried  objects  and  for  turbulent  water  that  contains 
more  air  bubbles  and  other  particles. 
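
The pixel size quoted above and the variance-based choice of image plane can be reproduced with a short sketch. The sound speed of 1500 m/s is an assumed nominal value that is not stated in the text, and the function names are hypothetical. Images are represented as lists of rows of pixel levels.

```python
SOUND_SPEED = 1500.0  # m/s, assumed nominal value for water

def pixel_range_extent(samples_per_pixel, fs, c=SOUND_SPEED):
    """Range extent (m) spanned by a block of echo samples, for two-way travel."""
    return samples_per_pixel * c / (2.0 * fs)

def variance(plane):
    """Population variance of all pixel levels in one 2-D image plane."""
    values = [v for row in plane for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def sharpest_plane(planes):
    """Index of the image plane whose pixel levels have the largest variance."""
    return max(range(len(planes)), key=lambda i: variance(planes[i]))
```

Under the assumed sound speed, pixel_range_extent(20, 500e3) gives 0.03 m, which agrees with the 3 cm volume element quoted above. Similarly, using every tenth echo from data spaced 0.36 degrees apart gives the stated 3.6-degree separation.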

The  two  images  in  figure  2  appear  to  be  identical,  but  actually  are  slightly  different.  The 
left-hand  image  was  derived  from  the  absolute  values  of  the  echo  samples.  Each  echo  has  a 
complex  representation  with  the  real  part  corresponding  to  the  echo  itself  and  the  imaginary 
part  corresponding  to  the  Hilbert  transform  of  the  echo.  The  envelope  of  the  echo  is  the 
magnitude  of  the  complex  representation  (i.e.,  the  square  root  of  the  sum  of  the  squares  of 
the  real  and  imaginary  parts).  For  a  point  target,  the  echo  envelope  is  broader  than  the 
absolute  value  of  the  real  part  of  the  echo.  For  uniformly  distributed  clutter,  this  broadening 
should  result  in  a  slight  decrease  in  SCR.  Such  a  difference  can  be  exploited  by  subtracting  a 
weighted  version  of  one  image  from  the  other  and  setting  negative  pixel  values  to  zero.  The 
resulting difference image can enhance the target SCR, as on the left side of figure 3, or
suppress it, as on the right side of figure 3. The negative contrast image on the right side of
figure  3  is  often  used  in  medical  ultrasound  and  may  prove  to  be  important  for  finding  buried 
sonar  targets. 
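
The weighted subtraction with clipping might be sketched as below. The images are assumed to be spatially registered lists of rows with identical dimensions, and the weight is a free parameter whose value is not given in the text. The function name is hypothetical.

```python
def difference_image(image_a, image_b, weight):
    """Pixelwise max(0, a - weight * b) for two registered images of equal size."""
    return [
        [max(0.0, a - weight * b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
```

Swapping the two arguments (with a suitable weight) produces the complementary negative contrast image, in which the component that was previously suppressed is retained instead.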

SCR  also  can  be  enhanced  or  suppressed  by  using  feature  images.  Rough  surfaces  and 
volume  reverberation  can  be  discriminated  from  smooth  surfaces  by  counting  the  number  of 
relatively  small  echo  maxima  in  a  short  delay  interval.  This  number  tends  to  be  larger  for 
rough  surfaces  and  for  volume  reverberation.  Figure  4  shows  difference  images  that  were 
constructed  from  a  roughness  feature  image  and  a  conventional  reflectivity  image.  Volume 
clutter  was  deaccentuated  and  SCR  was  increased  by  subtracting  a  weighted  roughness 
image  from  the  conventional  image  and  setting  all  negative  pixels  to  zero.  A  negative 
contrast  image  that  accentuates  clutter  and  makes  the  target  disappear  was  obtained  by 
subtracting  a  weighted  conventional  image  from  the  roughness  image  and  setting  all  negative 
pixels  to  zero.  The  roughness  feature  was  calculated  by  defining  a  “small”  echo  maximum  to 
have  an  amplitude  that  is  less  than  one-tenth  the  amplitude  of  the  largest  echo  maximum  in  a 
short  delay  interval.  The  delay  interval  consisted  of  10  echo  samples  on  either  side  of  each 
imaged  point.  Because  each  image  pixel  incorporated  20  of  the  original  echo  samples,  the 
interval  for  feature  measurement  is  uniquely  associated  with  each  image  pixel. 
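
The roughness feature described above might be computed as in the following sketch. Two details are assumptions made for illustration: the echo is taken to be rectified (non-negative magnitudes), and a maximum is defined as a sample strictly larger than both of its neighbors. The function and parameter names are hypothetical; the window half-width of 10 samples and the one-tenth amplitude ratio follow the description above.

```python
def roughness(echo, center, half_width=10, ratio=0.1):
    """Count 'small' local maxima within half_width samples of echo[center].

    A local maximum is small if its amplitude is less than ratio times the
    largest local maximum found in the same window.
    """
    lo = max(center - half_width, 1)
    hi = min(center + half_width, len(echo) - 2)
    # Local maxima: samples strictly greater than both neighbors.
    maxima = [
        echo[i]
        for i in range(lo, hi + 1)
        if echo[i] > echo[i - 1] and echo[i] > echo[i + 1]
    ]
    if not maxima:
        return 0
    largest = max(maxima)
    return sum(1 for m in maxima if m < ratio * largest)
```

Evaluating this count at the echo sample associated with each pixel, for every aspect angle, yields a generalized A-scan of roughness vs. range that can be combined in the same way as the reflectivity projections.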

The  target  did  not  lie  within  a  single  constant-elevation  plane,  and  many  parts  of  the  target 
were  not  well  focused  when  an  elevation  with  best  overall  focus  was  selected  for  the  imaging 
system.  The  focus  at  each  point  was  optimized  by  constructing  a  composite  image  consisting 
of  pixels  from  many  image  planes  at  different  elevations.  At  each  location  in  the  image,  the 
pixel  with  the  best  individual  focus  was  selected  from  the  different  representations  of  the 
same  pixel  in  image  planes  at  various  elevations.  For  an  image  plane  at  a  specified  elevation, 
a  measure  of  individual  pixel  focus  was  the  sum  of  the  magnitudes  of  the  differences  between 
the  pixel  value  and  the  values  of  surrounding  pixels.  This  measure  was  used  to  construct  the 
composite  image  on  the  right-hand  side  of  figure  5.  The  image  on  the  left-hand  side  of  figure 
5  is  a  conventional  single-elevation  image  (chosen  from  the  elevation  that  maximizes  an 
overall  focus  measure)  that  was  passed  through  an  adaptive  threshold.  Pixels  with  values  that 
were  less  than  the  threshold  value  were  set  to  zero.  The  right-hand  image  is  constructed  by 
selecting  pixels  with  best  individual  focus  from  the  left-hand  image  and  from  other  images 
constructed  with  different  elevation  hypotheses.  The  resulting  image  appears  to  be  better 
focused  than  the  conventional  image. 
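
The per-pixel focus measure and the selection across elevations might be sketched as follows. The choice of the eight nearest neighbors as the "surrounding pixels" is an assumption, as is the handling of image borders (missing neighbors are skipped). The function names are hypothetical.

```python
def pixel_focus(plane, r, c):
    """Sum of |pixel - neighbor| over the (up to eight) surrounding pixels."""
    rows, cols = len(plane), len(plane[0])
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                total += abs(plane[r][c] - plane[rr][cc])
    return total

def best_focus_composite(planes):
    """Pick, at each location, the pixel from the plane with the largest focus.

    planes is a list of registered 2-D images, one per hypothesized elevation.
    Returns the composite image and the index of the plane chosen per pixel.
    """
    rows, cols = len(planes[0]), len(planes[0][0])
    composite = [[0.0] * cols for _ in range(rows)]
    chosen = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            k = max(range(len(planes)), key=lambda i: pixel_focus(planes[i], r, c))
            composite[r][c] = planes[k][r][c]
            chosen[r][c] = k
    return composite, chosen
```

The map of chosen plane indices records the elevation selected at each location, which is the kind of information needed when elevation is later treated as a surface height.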

Figure 1. The target, an ONI-certified ROCKAN very shallow water (VSW) mine simulator, is
presented  in  the  top  panel.  The  outer  shell  is  fiberglass.  The  bottom  panels  depict  the 
dolphin  biosonar  click  used  to  ensonify  the  ROCKAN  and  its  associated  linear  spectrum. 
Click  duration  was  approximately  70  msec,  with  a  peak  frequency  of  approximately  110  kHz. 




Figure  2.  Two  SA  or  BP  representations  of  the  target  before  adaptive  thresholding  and 
contrast  enhancement. 


Figure  3.  Target  and  clutter  enhancement  based  on  differences  between  the  representations 
in  figure  2. 


Adaptive  focusing  as  in  figure  5  can  be  used  to  partially  compensate  for  nonhomogeneous 
propagation  media  and  other  adverse  effects.  Image  planes  at  multiple  elevations  provide 
redundancy  that  can  be  exploited  with  various  image  fusion  or  filtering  criteria.  For  a  buried 
target,  adaptive  combining  of  pixels  from  image  planes  at  different  elevations  can  be  used  to 
eliminate  those  elevations  where  SCR  is  comparatively  low.  A  weighted  sum  of  pixels  from 
different  closely  spaced  elevations  may  be  preferable  to  selecting  a  pixel  from  one  specific 
elevation. 

Image  planes  at  multiple  specified  elevations  can  also  be  used  to  estimate  the  physical 
shape  of  the  target  surface  and,  thus,  to  construct  a  vision-like  target  representation.  This 
application  depends  on  a  criterion  for  recognizing  a  surface  pixel.  At  present,  the  best  surface 
recognition  criterion  maximizes  reflectivity  while  constraining  the  gradient  of  reflectivity  to 
be  small  in  three  dimensions.  This  criterion  jointly  maximizes  reflectivity  and  a  smoothness 
measure. Figure 6a shows a composite image that is constructed by selecting pixels with
maximum surface recognition criterion from 40 different elevations. The corresponding
elevations are saved and used to construct a 3-D surface that represents the physical surface
of  the  target.  Figures  6b-6d  show  the  resulting  surface  estimate  from  slightly  different  aspect 
angles.  The  whiteness  of  the  surface  is  the  reflectivity  shown  in  figure  6a;  the  height  of  the 
surface  is  determined  by  the  elevations  of  pixels  that  maximize  the  surface  recognition 
criterion. 


Figure  4.  Difference  images  constructed  from  a  conventional  image  of  target  reflectivity  and 
a  feature  image  of  target  roughness.  A  “smoothness”  feature  image  is  constructed  by 
subtracting  a  weighted  roughness  feature  image  from  a  conventional  image  and  setting 
negative  pixel  values  to  zero. 


Figure  5.  A  conventional  image  from  the  elevation  that  maximizes  an  overall  focus  measure 
(left)  and  a  composite  image  composed  of  pixels  from  multiple  elevations,  where  a  focus 
measure  for  each  individual  pixel  is  maximized  (right). 

Figure 6 illustrates the possibility of using 3-D acoustic imaging data to construct a physical
replica of the target surface as it would be perceived visually. Such a representation
should  be  extremely  useful  for  human  sonar  operators.  If  echolocating  animals  were  able  to 
construct  such  a  model,  then  acoustic  data  could  be  used  to  directly  infer  the  visual  shape  of 
an object as suggested in Bastian (1986); cf. Helweg et al. (1996); Harley et al. (1996); and
Roitblat et al. (1995).

Figure 6. Figure 6a is a composite of pixels chosen from images at different elevations (z).
Each  pixel  value  in  this  image  is  the  maximum  value  of  a  surface  recognition  criterion 
(maximized  over  z).  Figures  6b  through  6d  are  composite  3-D  surface  images  seen  from 
slightly  different  aspects.  Surface  height  (z)  is  chosen  from  multiple  images  at  different 
elevations;  z  is  the  elevation  that  maximizes  the  surface  recognition  criterion  at  a  given 
location  (x,y)  in  the  image  plane.  The  whiteness  of  the  surface  equals  the  corresponding 
value  of  the  surface  recognition  criterion  and  is  the  same  as  in  6a.  The  goal  is  to  use  3-D 
acoustic  imaging  information  to  construct  a  version  of  the  target  that  resembles  a  visual 
representation of the target’s surface.

9. ALGORITHM EFFICIENCY

Synthetic aperture and back projection yield identical images, but typically use different
processing  algorithms.  There  appears  to  be  no  advantage  to  using  BP  algorithms  for  SA 
imaging  if  appropriate  quantities  are  pre-computed  for  synthetic  aperture  so  as  to  avoid 
needless,  iterative  computation  of  the  same  quantity.  When  such  preprocessing  is  used,  a  PC 
with a 166 MHz clock rate can process a new echo every 3 sec. This rate approaches real-time
computation, especially if faster, dedicated processors are used. The image is sequentially
synthesized by this process, and it can be evaluated before the process is completed.
Such  evaluation,  along  with  the  capability  to  estimate  part  of  an  image,  should  also  save  time 
and  increase  area  coverage  rate. 

10.  CONCLUSIONS 

The  equivalence  of  back  projection  and  synthetic  aperture  imaging  can  be  exploited  to 
obtain  the  following  advantages: 

1.  Theoretical  tradeoff  predictions  can  be  made  by  considering  the  effect  of  decreased 
angular  observation  interval  or  angular  sampling  rate  on  the  peak-to-sidelobe  ratio  of 
the  point  spread  function.  These  predictions  also  indicate  the  payoff  associated  with 
large  signal  bandwidth. 

2.  If  transducers  and  transmitted  waveforms  are  designed  to  make  the  range-azimuth 
beam  pattern  the  same  as  a  rotated  wavelet  basis  function,  then  high-quality  images 
can  be  obtained  with  a  substantial  reduction  in  angular  sampling  rate  (e.g.,  with 
observations  separated  by  30  degrees). 

3.  Images  can  be  constructed  sequentially,  and  there  is  no  need  to  synthesize  a  complete 
image  when  only  part  of  the  image  is  of  interest. 

4.  “Feature  images”  can  be  constructed  by  replacing  the  usual  reflectivity  feature  by 
other  localized  features  such  as  roughness  measures  and  spectral  descriptors. 

5.  Feature  images  can  be  combined  with  conventional  reflectivity  images  to  suppress  or 
enhance  clutter  or  targets  with  different  properties  (e.g.,  rough  vs.  smooth). 

6.  When  the  sonar  transducer  is  located  outside  the  plane  of  target  rotation,  elevation 
information  is  added  to  sonar  images,  allowing  for  3-D  imaging. 

7.  Elevation  information  can  improve  an  image  by  selecting  the  best-focused  version  of 
the  same  pixel  from  image  planes  at  different  elevations. 

8. Elevation information can be used to create a vision-like image of an object’s surface
by searching in elevation to find the pixel that maximizes a surface selection criterion
and by storing the elevation at which the pixel was found. These elevation points
define a surface that can be displayed to a human observer or used in an automatic
pattern recognition system. A composite representation can be used to illuminate the
surface with the corresponding reflectivity, thus identifying the parts of the surface
that are most reflective. Such information can be used to design “stealthy” targets by
identifying strongly reflective surface components.

9. Target images that are constructed with closely spaced hypothesized elevations can
be combined via linear and nonlinear filters so as to maximize SNR and SCR for
buried objects and turbulent propagation media.
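
The elevation search described in items 7 and 8 can be sketched in Python as follows. This is an illustration only: the surface selection criterion itself is not specified here, so the sketch takes the criterion values as an input array, and the function name surface_from_elevation_stack is hypothetical.

```python
import numpy as np

def surface_from_elevation_stack(criterion, elevations):
    """Pick, for every (x, y) pixel, the elevation that maximizes the criterion.

    criterion  : array of shape (n_z, ny, nx), one criterion image per elevation
    elevations : sequence of n_z hypothesized elevations
    Returns (composite, height), both of shape (ny, nx).
    """
    best = np.argmax(criterion, axis=0)        # index of the best elevation
    composite = np.max(criterion, axis=0)      # criterion value at that elevation
    height = np.asarray(elevations)[best]      # surface height z(x, y)
    return composite, height

# Tiny made-up stack: three elevations, one row of two pixels.
criterion = np.array([[[0.1, 0.9]],
                      [[0.5, 0.2]],
                      [[0.3, 0.4]]])
composite, height = surface_from_elevation_stack(criterion, [-1.0, 0.0, 1.0])
print(composite)   # criterion value used as surface brightness
print(height)      # elevation at which the maximum was found
```

The composite array plays the role of the whiteness in Figure 6a, and the height array defines the surface displayed in Figures 6b through 6d.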

APPENDIX  A

Back  Projection  and  Synthetic  Aperture  Processing 

In  order  to  compare  back  projection  with  synthetic  aperture  imaging,  it  is  helpful  to  review  some 
properties  of  two-dimensional  Fourier  transforms.  The  first  property  is  the  expression  for  the  2-D 
Fourier  transform  in  cylindrical  coordinates.  In  Cartesian  coordinates,  the  2-D  Fourier  transform  is 

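\[ f(x,y) = (2\pi)^{-2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(\omega_x,\omega_y) \exp[j(\omega_x x + \omega_y y)] \, d\omega_x \, d\omega_y. \tag{A1} \]

In cylindrical $(r,\theta)$ frequency domain coordinates, $\omega_x = r\cos\theta$, $\omega_y = r\sin\theta$, and

\[ F(\omega_x,\omega_y) = F(r\cos\theta, r\sin\theta) \equiv F_{cyl}(r,\theta). \tag{A2} \]

Cylindrical $(\rho,\phi)$ coordinates in the spatial domain are such that $x = \rho\cos\phi$, $y = \rho\sin\phi$, and

\[ f(x,y) = f(\rho\cos\phi, \rho\sin\phi) \equiv f_{cyl}(\rho,\phi). \tag{A3} \]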

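Substituting equations (A2) and (A3) into equation (A1), changing variables, and noting that the
Jacobian of the joint change of variables $\omega_x = r\cos\theta$, $\omega_y = r\sin\theta$ is

\[ \begin{vmatrix} \partial\omega_x/\partial r & \partial\omega_y/\partial r \\ \partial\omega_x/\partial\theta & \partial\omega_y/\partial\theta \end{vmatrix} = \begin{vmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{vmatrix} = r \tag{A4} \]

yields

\[ f_{cyl}(\rho,\phi) = (2\pi)^{-2} \int_{-\pi/2}^{\pi/2} \int_{-\infty}^{\infty} F_{cyl}(r,\theta) \exp[jr\rho(\cos\theta\cos\phi + \sin\theta\sin\phi)] \, |r| \, dr \, d\theta. \tag{A5} \]

The desired expression for the 2-D Fourier transform in cylindrical coordinates is obtained by using
the identity

\[ \cos\theta\cos\phi + \sin\theta\sin\phi = \cos(\theta - \phi) \tag{A6} \]

in equation (A5), which results in

\[ f_{cyl}(\rho,\phi) = (2\pi)^{-2} \int_{-\pi/2}^{\pi/2} \int_{-\infty}^{\infty} F_{cyl}(r,\theta) \exp[jr\rho\cos(\theta - \phi)] \, |r| \, dr \, d\theta. \tag{A7} \]

Rotation of an image in the x,y plane corresponds to a similar rotation of the 2-D Fourier transform of
the image in the $\omega_x,\omega_y$ plane. This property follows easily from equation (A7). Rotation in the x,y plane
by $\gamma$ radians transforms $f_{cyl}(\rho,\phi)$ to $f_{cyl}(\rho,\phi+\gamma)$. Replacing $\phi$ by $\phi+\gamma$ on the right-hand side of
equation (A7) and changing variables by letting $\theta' = \theta - \gamma$, the right side becomes the 2-D Fourier
transform of $F_{cyl}(r,\theta+\gamma)$. It follows that

\[ f_{cyl}(\rho,\phi+\gamma) \leftrightarrow F_{cyl}(r,\theta+\gamma) \tag{A8} \]

or

\[ f(x\cos\gamma - y\sin\gamma, \; y\cos\gamma + x\sin\gamma) \leftrightarrow F(\omega_x\cos\gamma - \omega_y\sin\gamma, \; \omega_y\cos\gamma + \omega_x\sin\gamma), \tag{A9} \]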

where  the  double  arrow  indicates  a  2-D  Fourier  transform  pair. 

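Another property of 2-D Fourier transforms is that the projection of f(x,y) onto the x-axis is the 1-D
inverse Fourier transform of $F(\omega_x,0)$. The projection of f(x,y) onto the x-axis is

\[ p_0(x) \equiv \int_{-\infty}^{\infty} f(x,y) \, dy. \tag{A10} \]

Integrating

\[ f(x,y) = (2\pi)^{-2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(\omega_x,\omega_y) \exp[j(\omega_x x + \omega_y y)] \, d\omega_x \, d\omega_y \]

with respect to y, and using $\int_{-\infty}^{\infty} \exp(j\omega_y y) \, dy = 2\pi\,\delta(\omega_y)$, yields

\[ p_0(x) = \int_{-\infty}^{\infty} f(x,y) \, dy = (2\pi)^{-1} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(\omega_x,\omega_y) \exp(j\omega_x x) \, \delta(\omega_y) \, d\omega_x \, d\omega_y = (2\pi)^{-1} \int_{-\infty}^{\infty} F(\omega_x,0) \exp(j\omega_x x) \, d\omega_x, \tag{A11} \]

which is the 1-D inverse Fourier transform of $F(\omega_x,0)$. It follows that

\[ \int_{-\infty}^{\infty} p_0(x) \exp(-j\omega_x x) \, dx = F(\omega_x,0). \tag{A12} \]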

The above projection property can be generalized to rotated versions of f(x,y). Rotating f(x,y) by $\theta$
radians yields

\[ f_\theta(x,y) = f(x\cos\theta - y\sin\theta, \; y\cos\theta + x\sin\theta). \tag{A13} \]

It follows from equation (A9) that the 2-D Fourier transform of $f_\theta(x,y)$ is a similarly rotated version of
$F(\omega_x,\omega_y)$:

\[ f_\theta(x,y) \leftrightarrow F_\theta(\omega_x,\omega_y) = F(\omega_x\cos\theta - \omega_y\sin\theta, \; \omega_y\cos\theta + \omega_x\sin\theta). \tag{A14} \]

Letting $p_\theta(x)$ denote the projection of $f_\theta(x,y)$ onto the x-axis, i.e.,

\[ p_\theta(x) \equiv \int_{-\infty}^{\infty} f_\theta(x,y) \, dy, \tag{A15} \]

equations (A10) through (A12) imply that the 1-D Fourier transform of $p_\theta(x)$ is $F_\theta(\omega_x,0)$. Using r instead
of $\omega_x$,

\[ \int_{-\infty}^{\infty} p_\theta(x) \exp(-jrx) \, dx = F_\theta(r,0) = F(r\cos\theta, r\sin\theta) = F_{cyl}(r,\theta). \tag{A16} \]

The projection of a rotated version of the image, f(x,y), can be Fourier-transformed in one dimension
to obtain the 2-D Fourier transform of the image in cylindrical coordinates, evaluated along a constant-$\theta$
slice in the frequency domain. This result is known as the projection-slice theorem (Munson et al., 1983).
It implies that a sequence of projections of incrementally rotated images can be used to obtain a sequence
of constant-$\theta$ slices of the 2-D Fourier transform of the image in cylindrical coordinates. The image can
be reconstructed from its projections by computing an inverse 2-D Fourier transform in cylindrical
coordinates, as in equation (A7). This form of reconstruction is known as back projection.
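
The projection-slice theorem has an exact discrete counterpart that is easy to check numerically. The Python sketch below is an illustration under stated assumptions: it uses a random test image and the discrete Fourier transform in place of the continuous one, and it checks only the 0 and 90 degree cases, where rotation does not require interpolation.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((32, 32))      # arbitrary test image, indexed f[iy, ix]
F = np.fft.fft2(f)                     # 2-D transform, indexed F[k_y, k_x]

# Aspect 0: project onto the x-axis (sum over y), then transform in 1-D.
slice_0 = np.fft.fft(f.sum(axis=0))
err_0 = np.max(np.abs(slice_0 - F[0, :]))      # compare with the omega_y = 0 slice

# Aspect 90 degrees: rotate the image, project onto the x-axis, transform.
slice_90 = np.fft.fft(np.rot90(f).sum(axis=0))
err_90 = np.max(np.abs(slice_90 - F[:, 0]))    # compare with the omega_x = 0 slice

print(err_0, err_90)   # both differences are at rounding-error level
```

Each projection therefore supplies one radial line of the 2-D transform, and rotating the image rotates the line that is obtained.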

To obtain a more explicit expression for the reconstructed image in terms of its projections, equation
(A16) is solved for $p_\theta(x)$ by taking the inverse 1-D Fourier transform of both sides of the equation.

\[ p_\theta(x) = (2\pi)^{-1} \int_{-\infty}^{\infty} F_{cyl}(r,\theta) \exp(jrx) \, dr. \tag{A17} \]

A gradual high-pass filter (similar to differentiation without the corresponding phase shift) can be
applied to the projection, yielding

\[ p_{\theta,HP}(x) = (2\pi)^{-1} \int_{-\infty}^{\infty} F_{cyl}(r,\theta) \exp(jrx) \, |r| \, dr. \tag{A18} \]

The integral on the right-hand side of equation (A18) is contained in the 2-D inverse Fourier transform
expression, equation (A7), in the form

\[ (2\pi)^{-1} \int_{-\infty}^{\infty} F_{cyl}(r,\theta) \exp[jr\rho\cos(\theta - \phi)] \, |r| \, dr = p_{\theta,HP}[\rho\cos(\theta - \phi)]. \tag{A19} \]

Substituting equation (A19) into equation (A7) yields

\[ f_{cyl}(\rho,\phi) = (2\pi)^{-1} \int_{-\pi/2}^{\pi/2} p_{\theta,HP}[\rho\cos(\theta - \phi)] \, d\theta. \tag{A20} \]

The original image, $f_{cyl}(\rho,\phi)$, can be reconstructed by summing high-pass filtered projections. To
obtain the image, f(x,y), in Cartesian coordinates, recall from equation (A3) that

\[ f_{cyl}(\rho,\phi) = f(\rho\cos\phi, \rho\sin\phi) = f(x,y) \]

and from equation (A6) that

\[ \cos(\theta - \phi) = \cos\theta\cos\phi + \sin\theta\sin\phi. \]

It follows that equation (A20) can be written as

\[ f(\rho\cos\phi, \rho\sin\phi) = (2\pi)^{-1} \int_{-\pi/2}^{\pi/2} p_{\theta,HP}[(\rho\cos\phi)\cos\theta + (\rho\sin\phi)\sin\theta] \, d\theta \tag{A21} \]

or

\[ f(x,y) = (2\pi)^{-1} \int_{-\pi/2}^{\pi/2} p_{\theta,HP}(x\cos\theta + y\sin\theta) \, d\theta. \tag{A22} \]

Back  projection  algorithms  typically  utilize  the  projection-slice  theorem  to  obtain  the  2-D  Fourier 
transform  of  the  image  in  cylindrical  coordinates.  The  image  is  then  reconstructed  via  a  2-D  inverse 
Fourier  transform  operation.  The  equivalent  expression  in  equation  (A22),  however,  is  useful  for 
illustrating  the  similarity  between  back  projection  and  synthetic  aperture  processing. 
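
Equation (A22) can be tried out numerically. The Python sketch below is a minimal illustration, not the algorithm used in the report. It assumes equally spaced aspect angles, an FFT-based version of the $|r|$ filter in equation (A18), linear interpolation of the filtered projections, and a Gaussian test object whose projections are known in closed form. The function names ramp_filter and back_project are hypothetical.

```python
import numpy as np

def ramp_filter(p, ds):
    """High-pass filter one projection with transfer function |r|, as in (A18)."""
    r = 2.0 * np.pi * np.fft.fftfreq(p.size, d=ds)   # angular frequency samples
    return np.fft.ifft(np.fft.fft(p) * np.abs(r)).real

def back_project(projections, thetas, s, xs, ys):
    """Approximate (A22) by a finite sum over equally spaced aspect angles."""
    X, Y = np.meshgrid(xs, ys, indexing="xy")
    image = np.zeros(X.shape)
    ds = s[1] - s[0]
    dtheta = thetas[1] - thetas[0]
    for p, theta in zip(projections, thetas):
        p_hp = ramp_filter(p, ds)
        # Sample the filtered projection at x cos(theta) + y sin(theta).
        image += np.interp(X * np.cos(theta) + Y * np.sin(theta), s, p_hp)
    return image * dtheta / (2.0 * np.pi)

# Test object (assumed): Gaussian blob of unit height centred at (x0, y0).
sigma, x0, y0 = 1.5, 2.0, -1.5
s = (np.arange(256) - 128) * 0.25                    # projection axis samples
n_theta = 90
thetas = -np.pi / 2 + (np.arange(n_theta) + 0.5) * np.pi / n_theta
projections = [
    sigma * np.sqrt(2.0 * np.pi)
    * np.exp(-(s - (x0 * np.cos(t) + y0 * np.sin(t))) ** 2 / (2.0 * sigma**2))
    for t in thetas
]

xs = ys = np.arange(-10, 11) * 0.5
image = back_project(projections, thetas, s, xs, ys)
iy, ix = np.unravel_index(np.argmax(image), image.shape)
print("peak at x, y =", xs[ix], ys[iy], "value", image[iy, ix])
```

The reconstructed peak should fall at the blob centre with a height close to one, which provides a check on the $(2\pi)^{-1}$ scaling in equation (A22).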

For  radar/sonar/ultrasound  processing,  suppose  that  a  transducer  is  placed  on  the  negative  x-axis.  A 
target  is  rotated  about  the  origin  of  the  coordinate  system,  and  the  range  is  defined  to  be  zero  at  the 
center  of  rotation.  Projections  of  the  target  are  obtained  by  rotating  the  target  clockwise  and  recording 
reflectivity  vs.  range  (A-scan)  data  at  each  rotation  after  filtering  to  obtain  an  estimate  of  the  target 
impulse  response  (e.g.,  matched  filtering).  The  integration  surfaces  for  each  projection  correspond  to 
points with constant delay (e.g., spherical shells). The thickness of the integration surfaces or shells is
determined by the range resolution cell of the system (i.e., by the system bandwidth).
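
As an illustration of how an A-scan can be formed by matched filtering, the Python sketch below correlates a synthetic received signal with the transmitted pulse. All numerical values (sample rate, chirp band, reflector delays and amplitudes, sound speed) are assumptions chosen for the example and do not come from the report.

```python
import numpy as np

fs = 100e3                          # sample rate in Hz (assumed)
t = np.arange(200) / fs             # 2 ms transmitted pulse
f0, f1 = 10e3, 30e3                 # linear FM sweep in Hz (assumed)
k = (f1 - f0) / t[-1]
pulse = np.cos(2.0 * np.pi * (f0 * t + 0.5 * k * t**2))

# Received signal: two point reflectors at known two-way delays (in samples).
delays, amps = [300, 520], [1.0, 0.4]
echo = np.zeros(1000)
for d, a in zip(delays, amps):
    echo[d:d + pulse.size] += a * pulse

# Matched filter: correlate the echo with the pulse, normalized by pulse energy.
a_scan = np.correlate(echo, pulse, mode="valid") / np.dot(pulse, pulse)

c = 1500.0                          # sound speed in m/s (assumed, water)
ranges = c * np.arange(a_scan.size) / fs / 2.0   # two-way delay to range
peak = int(np.argmax(np.abs(a_scan)))
print("strongest reflector at sample", peak, "range", ranges[peak], "m")
```

The compressed pulse is much shorter than the transmitted pulse, and its width sets the thickness of the constant-delay integration shells.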

To track a point on the target with initial position (x,y), the matched filtered echo (A-scan) from the
target is evaluated at range $x\cos\theta + y\sin\theta$ as the target is rotated. Equation (A22) describes a sum of
high-pass, matched filtered echoes from the point on the target at initial position (x,y) as the target is rotated.

The same A-scan data can be obtained by moving the transducer in a circle around the target, or by
using a large array of transducers that are arranged in a circle with the target at the center. The second
alternative is an actual array, while the first is a synthetic array. The array is focused on a target point by
delay-and-sum beam forming. Consider a transducer that is located on a circle, $\theta$ radians counterclockwise
relative to the negative x-axis. A signal is transmitted toward the target from this transducer, and the
resulting echo is received by the same transducer and matched filtered or otherwise processed to estimate
target impulse response. The contribution of this filtered transducer output to the beam former image of
the target point at (x,y) is a sample of the matched filtered echo. This sample is chosen to correspond to the
range of the target point (i.e., to a range of $x\cos\theta + y\sin\theta$ when range zero is at the center of the circle).
The delay-and-sum beam former for the real or synthetic array approximates the integral in equation
(A22) by a finite sum over a sequence of aspect ($\theta$) values. Such a finite sum approximation is also used
in back projection. If the matched filtered echoes are high-pass filtered by using a filter with transfer
function $|\omega|$, synthetic aperture imaging and back projection are equivalent processes.

One way to exploit the equivalence of SAS and BP is to form a 3-D image when the transducer is
above the plane of rotation. If the transducer is located above the negative x-axis such that the line
between the transducer and the origin forms an angle, $\alpha$, relative to the negative x-axis, and if the target
is in the far field of the transducer, then equation (A22) becomes

\[ f(x,y,z) = (2\pi)^{-1} \int_{-\pi/2}^{\pi/2} p_{\theta,HP}[(x\cos\theta + y\sin\theta)\cos\alpha - z\sin\alpha] \, d\theta. \tag{A23} \]
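
A finite-sum version of equation (A23) can be sketched in Python to show how focusing depends on the hypothesized elevation. This is an illustration under assumptions, not the processing used in the report: the filtered echoes are idealized as narrow Gaussian pulses from a single point reflector, linear interpolation is used, and the function name back_project_at_elevation is hypothetical.

```python
import numpy as np

def back_project_at_elevation(filtered_echoes, thetas, s, xs, ys, z, alpha):
    """Finite-sum version of (A23) for one hypothesized elevation z."""
    X, Y = np.meshgrid(xs, ys, indexing="xy")
    image = np.zeros(X.shape)
    dtheta = thetas[1] - thetas[0]
    for p_hp, theta in zip(filtered_echoes, thetas):
        rng = (X * np.cos(theta) + Y * np.sin(theta)) * np.cos(alpha) - z * np.sin(alpha)
        image += np.interp(rng, s, p_hp)       # sample the echo at that range
    return image * dtheta / (2.0 * np.pi)

# Toy data (assumed): point reflector at (x0, y0, z0), transducer 45 deg up.
alpha = np.deg2rad(45.0)
x0, y0, z0 = 1.0, -2.0, 1.0
s = (np.arange(801) - 400) * 0.05              # range axis samples
n_theta = 60
thetas = -np.pi / 2 + (np.arange(n_theta) + 0.5) * np.pi / n_theta

def reflector_range(theta):
    """Range of the reflector at aspect theta, from the argument of (A23)."""
    return (x0 * np.cos(theta) + y0 * np.sin(theta)) * np.cos(alpha) - z0 * np.sin(alpha)

echoes = [np.exp(-(s - reflector_range(t)) ** 2 / (2.0 * 0.2**2)) for t in thetas]

xs = ys = np.arange(-4, 5) * 1.0
elevations = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
stack = np.stack([back_project_at_elevation(echoes, thetas, s, xs, ys, z, alpha)
                  for z in elevations])
best = elevations[np.argmax(stack.max(axis=(1, 2)))]
print("best-focused elevation:", best)
```

Only at the correct elevation do the echoes from all aspects add at the same pixel, so the image peak is largest there. The stack of images over elevation is the input to the pixel selection described in the conclusions.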